NSF PAR Search | NSF Public Access Repository

Stochastic Optimal Control Matching

Domingo-Enrich, Carles; Han, Jiequn; Amos, Brandon; Bruna, Joan; Chen, Ricky (December 2024, Neural Information Processing Systems)

Full Text Available
Multisample Flow Matching: Straightening Flows with Minibatch Couplings

Pooladian, Aram-Alexandre; Ben-Hamu, Heli; Domingo-Enrich, Carles; Amos, Brandon; Lipman, Yaron; Chen, Ricky T. (May 2023, ICML 2023)

Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constructing probability paths. We propose Multisample Flow Matching, a more general framework that uses non-trivial couplings between data and noise samples while satisfying the correct marginal constraints. At very small overhead costs, this generalization allows us to (i) reduce gradient variance during training, (ii) obtain straighter flows for the learned vector field, which allows us to generate high-quality samples using fewer function evaluations, and (iii) obtain transport maps with lower cost in high dimensions, which has applications beyond generative modeling. Importantly, we do so in a completely simulation-free manner with a simple minimization objective. We show that our proposed methods improve sample consistency on downsampled ImageNet data sets, and lead to better low-cost sample generation.
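The coupling idea in this abstract can be illustrated with a minimal sketch. It assumes a squared-Euclidean cost, a brute-force search over permutations in place of a real optimal transport solver (practical only for tiny batches), and straight-line interpolation between paired samples. All function names here are hypothetical and are not taken from the paper. Because the coupling is a permutation, every noise sample and every data sample is used exactly once, which is how the minibatch marginals are preserved.

```python
import itertools
import random


def squared_distance(x, y):
    """Squared Euclidean distance between two points given as lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def optimal_minibatch_coupling(noise, data):
    """Pair noise[i] with data[perm[i]] so the total squared distance is minimal.

    Brute force over all permutations: an illustrative stand-in for an
    optimal transport solver, usable only for very small batches.
    """
    n = len(noise)
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(squared_distance(noise[i], data[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


def flow_matching_targets(noise, data, perm, t):
    """Interpolated points and target velocities for each coupled pair.

    Assumes a straight-line path x_t = (1 - t) * x0 + t * x1, whose
    velocity is x1 - x0.
    """
    points, velocities = [], []
    for i, j in enumerate(perm):
        x0, x1 = noise[i], data[j]
        points.append([(1 - t) * a + t * b for a, b in zip(x0, x1)])
        velocities.append([b - a for a, b in zip(x0, x1)])
    return points, velocities


rng = random.Random(0)
noise = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]
data = [[rng.gauss(3, 0.5), rng.gauss(-1, 0.5)] for _ in range(5)]

perm, coupled_cost = optimal_minibatch_coupling(noise, data)
# Cost of the independent pairing (noise[i] with data[i]) for comparison.
independent_cost = sum(squared_distance(x0, x1) for x0, x1 in zip(noise, data))
points, velocities = flow_matching_targets(noise, data, perm, t=0.5)
```

In this sketch the coupled cost can never exceed the independent cost, since the identity pairing is one of the candidate permutations. A regression model trained on `points` and `velocities` would then see shorter, less crossing paths than with independent pairing.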
Full Text Available
On Energy-Based Models with Overparametrized Shallow Neural Networks

Domingo-Enrich, Carles; Bietti, Alberto; Vanden-Eijnden, Eric; Bruna, Joan (June 2021, International Conference on Machine Learning)
Energy-based models (EBMs) are a simple yet powerful framework for generative modeling. They are based on a trainable energy function which defines an associated Gibbs measure, and they can be trained and sampled from via well-established statistical tools, such as MCMC. Neural networks may be used as energy function approximators, providing both a rich class of expressive models as well as a flexible device to incorporate data structure. In this work we focus on shallow neural networks. Building from the incipient theory of overparametrized neural networks, we show that models trained in the so-called “active” regime provide a statistical advantage over their associated “lazy” or kernel regime, leading to improved adaptivity to hidden low-dimensional structure in the data distribution, as already observed in supervised learning. Our study covers both maximum likelihood and Stein Discrepancy estimators, and we validate our theoretical results with numerical experiments on synthetic data.
Full Text Available
A mean-field analysis of two-player zero-sum games

Domingo-Enrich, Carles; Jelassi, S; Mensch, A; Rotskoff, G; Bruna, J (December 2020, Advances in Neural Information Processing Systems)
Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g., for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror descent. Yet this approach does not scale to high dimensions. To address this limitation, we parametrize mixed strategies as mixtures of particles, whose positions and weights are updated using gradient descent-ascent. We study these dynamics as an interacting gradient flow over measure spaces endowed with the Wasserstein-Fisher-Rao metric. We establish global convergence to an approximate equilibrium for the related Langevin gradient-ascent dynamic. We prove a law of large numbers that relates particle dynamics to mean-field dynamics. Our method identifies mixed equilibria in high dimensions and is demonstrably effective for training mixtures of GANs.
Full Text Available
Extra-gradient with player sampling for provable fast convergence in n-player games

Jelassi, Samy; Domingo-Enrich, Carles; Scieur, Damien; Mensch, Arthur; Bruna, Joan (January 2020, International Conference on Machine Learning)

Data-driven modeling increasingly requires finding a Nash equilibrium in multi-player games, e.g., when training GANs. In this paper, we analyse a new extra-gradient method for finding Nash equilibria that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits a better rate of convergence than full extra-gradient for non-smooth convex games with a noisy gradient oracle. We propose an additional variance reduction mechanism to obtain speed-ups in smooth convex games. Our approach makes extrapolation amenable to massive multiplayer settings and brings empirical speed-ups, in particular when using a heuristic cyclic sampling scheme. Most importantly, it enables faster and better training of GANs and mixtures of GANs.
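The sampled extrapolation-and-update step described in this abstract can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the authors' algorithm: the 4-player quadratic game, the matrix `A`, the function names, and the choice to draw two independent player subsets per iteration are all hypothetical, gradients are exact rather than noisy, and no variance reduction or cyclic sampling is included. The game is constructed so that its unique Nash equilibrium is the origin.

```python
import math
import random

# Hypothetical 4-player game: player i picks a scalar x[i] and minimizes
#   0.5 * x[i]**2 + x[i] * sum_j A[i][j] * x[j]
# A is antisymmetric, so the game is monotone and x = 0 is its equilibrium.
A = [
    [0.0, 0.5, -0.5, 0.5],
    [-0.5, 0.0, 0.5, -0.5],
    [0.5, -0.5, 0.0, 0.5],
    [-0.5, 0.5, -0.5, 0.0],
]


def player_gradient(x, i):
    """Gradient of player i's loss with respect to its own action x[i]."""
    return x[i] + sum(A[i][j] * x[j] for j in range(len(x)))


def sampled_extragradient(x, step, batch, iterations, rng):
    """Extra-gradient where only a random subset of players moves per phase."""
    x = list(x)
    players = range(len(x))
    for _ in range(iterations):
        # Extrapolation phase: sampled players take a look-ahead step.
        extrapolated = list(x)
        for i in rng.sample(players, batch):
            extrapolated[i] = x[i] - step * player_gradient(x, i)
        # Update phase: a second sampled subset moves from the original
        # point, using gradients evaluated at the extrapolated point.
        updated = list(x)
        for i in rng.sample(players, batch):
            updated[i] = x[i] - step * player_gradient(extrapolated, i)
        x = updated
    return x


def norm(x):
    return math.sqrt(sum(v * v for v in x))


rng = random.Random(0)
start = [1.0, -2.0, 0.5, 1.5]
final = sampled_extragradient(start, step=0.1, batch=2, iterations=1000, rng=rng)
initial_norm, final_norm = norm(start), norm(final)
```

Each iteration touches only `batch` players per phase, so the per-iteration gradient cost scales with the subset size rather than with the total number of players. On this toy game the iterate should shrink toward the origin, which can be checked by comparing `final_norm` with `initial_norm`.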
Full Text Available
